As with nonparametric tests, the aim of parametric tests is to check whether or not a specific sample can come from a population with specific parameters. But in the case of parametric tests, most of the time these parameters are more obvious than in the case of the non-parametric ones. For example, if we compare the estimates of the population mean from two samples, which is the heart of the t-test, it is quite clear that this statistical test is a bridge from our sample to the population.
In the case of parametric tests, the building material of this bridge is more sophisticated. To stretch the metaphor further, you could say that this building material is stronger due to its technical sophistication, and therefore the resulting bridges are more stable. But only if the building material is solid itself. So there are higher prerequisites for the material of, say, the Golden Gate Bridge than for a trail bridge somewhere in the woods.
Also, because we make more assumptions about the population, it is even more important that the sample is selected in a representative way for the result of a parametric test to be valid. So in interpreting the results of a parametric test, even more than in the case of the non-parametric versions, we have to keep in mind that we are using our sample, which is available to us, to estimate characteristics of the population, which is not available to us, especially in the case of archaeology. And in archaeology especially, since most of the time we are working with a convenience sample, it is difficult to provide building material solid enough for parametric tests.
Nevertheless, because these tests are in most cases much more powerful than their non-parametric counterparts, there is good reason to use them. That is also why parametric tests play the most important role in most natural scientific investigations. While in the case of nonparametric tests no assumption about the distribution of the sample or the population is made, with parametric tests we make such assumptions and use this added information to come to more precise conclusions. Another possible disadvantage of parametric tests in the case of archaeology is that they usually require a larger sample size than nonparametric tests. But especially when it comes to metric variables and measured quantities, as we have for example in the field of morphometrics, parametric tests are a powerful tool if used in the right way.
In the case of the nonparametric tests, we have already explained that these tests do not require the parameters of the sample and the underlying population to follow certain distributions or have certain values. With this, you already have the definition of parametric tests: here, we assume certain values or distributions to be given in the sample that we are investigating, or actually in the population underlying the sample. These assumptions narrow down the possibilities of events, giving us a smaller event space. With this, we can decide more easily if something has a 95% probability or not. In the case of one-sample tests, most of the time it is assumed that the population follows a certain distribution. At least in that realm of statistics that can be applied to archaeology, this will most of the time be the so-called normal distribution, which we have already learnt about in the chapter on distributions. But for other applications, other distributions might be required. One example here is the t-test, which we will learn about later on, and which requires the data to be normally distributed (among other things).
Especially in the case of two-sample tests, often certain similarities between the two samples are required so that the test can be applied. In the case of the ANOVA, variance homogeneity is assumed, and it is necessary for conducting this analysis in the correct way.
Lastly, nearly all parametric tests require variables that are metric scaled, i.e. interval or ratio scaled. The reason for this is that most parametric tests use parameters actually calculated from the data, and if these parameters cannot be calculated, because the data quality does not permit it, the tests of course do not make any sense. One example here might be the F test, which is a test for variances. As the variance is defined as the deviation from the mean, it can only reasonably be expressed in situations where we can calculate the mean. As we have learnt earlier, the mean can only be calculated with metric variables.
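A quick illustration in R (the variable names and values here are made up for the example): the mean, and with it the variance, can only be computed for metric, i.e. numeric, data.

```r
# a nominal variable: vessel types (hypothetical example data)
vessel_type <- factor(c("bowl", "jar", "bowl", "plate", "jar"))
mean(vessel_type)   # NA with a warning: a mean of nominal data is meaningless

# a metric variable: rim diameters in cm (hypothetical example data)
rim_diameter <- c(12.1, 15.4, 11.8, 20.3, 16.0)
mean(rim_diameter)  # 15.12
var(rim_diameter)   # the variance builds on the mean
```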
This means that before we can apply an actual parametric test, we often have to determine whether our data quality is suitable for this test in the first place. And if these checks in turn have prerequisites, those might have to be verified with yet another test. From this, quite often cascades of tests emerge. Below, you can see a cascade for the application of the t-test.
There are multiple versions of this kind of decision tree, and you will find legions of them on the Internet and in the literature. We will follow the path through this decision tree to end up at our t-test. At first, we will check for normality, then we will test whether the variances of our data are homogeneous, and lastly we will apply the t-test to our data.
There are different ways to test data for normality. One rather conventional option is the Kolmogorov-Smirnov test, which we have seen before as a two-sample test for comparing different distributions. And we have learnt that it is a non-parametric test. Of course, it can also be used to compare the sample distribution to a theoretical one, making it a one-sample goodness-of-fit test. But since it is very conservative, we will also use the Shapiro-Wilk test as well as a visual representation of normality, namely the QQ plot.
We will start with the good old Kolmogorov-Smirnov test. As an example, we will use a simulated dataset of the lengths of different silex blades. We are interested in differences in length between two sites. If a significant difference existed between these two sites, this could give us information about different production or use strategies that the occupants of the individual sites had in mind when preparing these blades.
We will start with a reduced selection from the actual dataset. Below, you can find the code that you can run to get this selection into the R environment without loading a dataset.
blade_length<-c(14.9, 24.0, 8.7, 29.3, 25.5, 23.9, 22.4,
12.7, 8.7, 25.1, 25.6, 14.7, 23.0, 23.2,
26.5, 11.1, 15.2, 20.6, 20.1, 25.1)
Now we can directly run the Kolmogorov-Smirnov test. In the two-sample version we have seen that we give the first sample as the first input and the second sample as the second input. In the case of the one-sample test, there is no second sample. Instead, here we give the name of the distribution that we would like to test against. More precisely, we name the cumulative distribution function of this distribution as we would if we wanted to compute it in R. So in the case of the normal distribution, the string given as the second parameter would be "pnorm". Depending on the distribution used in the goodness-of-fit test, further parameters might be necessary. These parameters are the same as you would use to calculate the distribution function in R. If you are unsure, you can look this up in the respective help file (in our example the help file for pnorm). The normal distribution is defined by the mean and the standard deviation. Of course, we would like to compare our specific sample to a distribution that has the same mean and standard deviation as our dataset (but is normal). For this reason, we give as the mean for the normal distribution the mean of our sample, and likewise as the standard deviation the standard deviation of our sample. The resulting command is this:
ks.test(blade_length,"pnorm",mean(blade_length),sd(blade_length))
## Warning in ks.test(blade_length, "pnorm", mean(blade_length), sd(blade_length)):
## ties should not be present for the Kolmogorov-Smirnov test
##
## One-sample Kolmogorov-Smirnov test
##
## data: blade_length
## D = 0.19659, p-value = 0.4221
## alternative hypothesis: two-sided
In the resulting output, you can see different things: at first, it is stated that this is a “One-sample Kolmogorov-Smirnov test”. Then our dataset is named, and then the D value and the p-value are shown as in the two-sample case. Here, the p-value is 0.42, so the result is not significant. Our distribution does not differ significantly from a normal distribution. Additionally you will see a warning, stating that there should be no ties (German “Bindungen”) in the data for the KS test. This means that different data points should not have equal values. This is indeed a problem here, but we cannot do very much about it, because these are the data that we have. Nevertheless the result should be okay for the Kolmogorov-Smirnov test. This would mean that we can proceed further and use this dataset for the following tests.
But, as you remember, there are two reasons why a test might not be significant: either the differences are really small compared to the null hypothesis, or the sample size is too small. And in this case, it is quite likely that the sample size is actually rather small. Also, the KS test is known to be a rather conservative test. That means that the null hypothesis is not easily rejected using this test.
The KS test represents a general test that can be used in both the one-sample and the two-sample case, and in the one-sample case it is also a general goodness-of-fit test, meaning that you can test against various other distributions. This makes the test very versatile for different applications. But for the normal distribution, there are more specialised tests that cannot do much more than test for normality, but which do this one thing very well. One example is the Shapiro-Wilk test. It is generally regarded as having higher power when it comes to checking for normality. But this is its only application; it is not a general goodness-of-fit test. Normality is what we are interested in here, so let's see how we can apply this test.
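This difference in power can be illustrated with a small simulation (a sketch: samples are drawn from an exponential, i.e. clearly non-normal, distribution, and we count how often each test detects the deviation; the exact rates depend on the random seed and the chosen alternative):

```r
# Compare the power of the KS test and the Shapiro-Wilk test
# against a clearly non-normal (exponential) alternative
set.seed(42)

n_sim <- 1000   # number of simulated samples
n     <- 20     # sample size, similar to our blade selection

ks_significant      <- 0
shapiro_significant <- 0

for (i in 1:n_sim) {
  x <- rexp(n)  # exponentially distributed, i.e. not normal
  if (suppressWarnings(ks.test(x, "pnorm", mean(x), sd(x)))$p.value < 0.05)
    ks_significant <- ks_significant + 1
  if (shapiro.test(x)$p.value < 0.05)
    shapiro_significant <- shapiro_significant + 1
}

ks_significant / n_sim       # rejection rate of the KS test
shapiro_significant / n_sim  # rejection rate of the Shapiro-Wilk test
```

The Shapiro-Wilk test should detect the non-normality considerably more often than the conservative KS test.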
requirements: \(x_1 ... x_n\) is an independent sample of a metric scaled variable
\(H_0\): The population is normally distributed
\(H_1\): The population is not normally distributed
Its specific benefit is that it has good test power even with small sample sizes below 50. Since it does only one thing and does it well, there is not very much that can be customised with shapiro.test(), as it is named in R. Actually, the command does not accept any parameter besides the dataset that should be tested for normality, and this dataset should be a numeric vector. So, let's run this test with our blade length selection.
shapiro.test(blade_length)
##
## Shapiro-Wilk normality test
##
## data: blade_length
## W = 0.90199, p-value = 0.04494
You can see that this test now results in a significant p-value. Well, to be honest, it is only just significant. Now we have to decide if we accept our dataset for further tests that require normality, or if we reject it for parametric tests and apply a non-parametric variant. Here are some rules of thumb: there are problems with very small and very large sample sizes. In the case of very small sample sizes, the Shapiro-Wilk test performs better. In the case of very large sample sizes, both tests will give you a significant result even if the difference from a normal distribution is very small. This is the other side of the law of large numbers: if you have very many samples, then a statistical test will flag even very small differences as significant. This does not necessarily mean that we cannot use the t-test any longer: the t-test is known to be rather robust against violations of the normality assumption. Especially if the sample size is larger than 50, we can more or less ignore a significant result.
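We can demonstrate this effect of sample size with simulated data (a sketch: the t distribution with 10 degrees of freedom is very close to a normal distribution, only with slightly heavier tails; the exact p-values depend on the random seed):

```r
set.seed(1)

# t distribution with 10 degrees of freedom: very close to a normal
# distribution, only the tails are slightly heavier
small_sample <- rt(20, df = 10)
large_sample <- rt(4000, df = 10)

shapiro.test(small_sample)$p.value  # usually not significant
shapiro.test(large_sample)$p.value  # the tiny deviation becomes 'significant'
```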
Our dataset here has a sample size of 20. The full dataset has a sample size of 45, so it will be very close to this border of 50. Also, the test statistic itself does not tell us very much about how the distribution of our sample differs from the normal distribution. For this reason, it is often very helpful to visualise the distribution of our data. We have already learnt different methods for doing so. In this case, for example, a histogram or a density plot might be helpful, or a box plot. But there is a very specific visualisation especially designed for checking normality, which we will get to know in the next section.
This specific plot variant is called a QQ plot, named so because it compares the quantiles of the actual distribution to those of the normal distribution. But first, let's visualise the distribution of our values in a conventional way, so that we can better understand what the QQ plot is doing.
plot(sort(blade_length))
If we plot our values sorted, we can see that there is a gap in the range of values between 15 and 20. With a normal distribution, you would expect most values near the mean. This is not the case here.
hist(blade_length)
plot(density(blade_length))
This can also be visualised using the histogram or the density plot. In the histogram, the gap between 15 and 20 becomes obvious. It also becomes obvious that there are more values in the higher range of the total value range than in the lower one. The density plot confirms this interpretation: it shows us a bimodal distribution.
boxplot(blade_length)
The box plot looks fairly normal except for the fact that the median is shifted towards the higher range of the spectrum, confirming what we already know.
Now, let's make a QQ plot and compare it to the plot options that we already had. The command for the QQ plot is qqnorm(). There is also a command qqplot(), but this is designed to produce a plot for two samples. Also, most of the time you will want to enhance the QQ plot with a line that makes the theoretical position of a normal distribution more visible. For this you use the command qqline().
qqnorm(blade_length)
qqline(blade_length)
The resulting plot has the quantiles of the sample on the y axis, and the quantiles of a normal distribution with the same parameters as our dataset on the x axis. The line represents where a true normal distribution would end up in this chart. If your actual data points deviate strongly from the line, then you have strong deviations from the pattern of a normal distribution. In our case, we can see an underrepresentation in the centre of our distribution (the gap) and an overrepresentation of higher values, as we have already seen in the other plots. The lower end of the sample looks rather normal, except for one outlier. So, given this deviation and the results of the individual tests, this dataset would probably not be suitable for a test that requires a normal distribution. In the following, we will use the full dataset, which follows the same pattern, but has a higher sample size. In this case, the deviations from the normal distribution can probably be ignored.
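To better understand what qqnorm() does internally, we can build a simplified QQ plot by hand (a sketch: qqnorm() plots the sorted sample values against standard normal quantiles at evenly spaced probability points; note that qqline() actually draws its line through the first and third quartiles, so the mean/standard deviation line used here is a simplification):

```r
n <- length(blade_length)

# theoretical quantiles of the standard normal distribution
# at evenly spaced probability points
theoretical <- qnorm(ppoints(n))

# the empirical quantiles are simply the sorted sample values
empirical <- sort(blade_length)

plot(theoretical, empirical,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")

# simplified reference line: intercept = mean, slope = standard deviation
abline(a = mean(blade_length), b = sd(blade_length))
```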
blade_length <- read.csv("blade_length.csv")
shapiro.test(blade_length$length)
##
## Shapiro-Wilk normality test
##
## data: blade_length$length
## W = 0.94408, p-value = 0.03045
qqnorm(blade_length$length)
qqline(blade_length$length)
Length of the handles of amphorae of type Dressel 10 (Ihm 1978).
The lengths of the handles of different amphorae are given. Test with the appropriate methods whether the variable is normally distributed.
file: henkel_amphoren.csv
Test for normality
At first, we load the data (you will hopefully realise by inspecting the file that it is a continental CSV file, which is why we have to load it using the command read.csv2()):
dressel <- read.csv2("henkel_amphoren.csv")
Then, we perform a shapiro.test().
shapiro.test(dressel$henkellaenge)
##
## Shapiro-Wilk normality test
##
## data: dressel$henkellaenge
## W = 0.88495, p-value = 0.02174
The Shapiro-Wilk test shows a significant deviation from the normal distribution. Nevertheless, we should inspect the pattern visually:
qqnorm(dressel$henkellaenge)
qqline(dressel$henkellaenge)
There are two outliers in the data. Either we could switch to nonparametric tests, or, if we can change our scientific question so that it can be answered without these outliers, we could remove them from our dataset. Every time we exclude a value from an analysis, we should document this in the paper or manuscript that the analysis forms a part of.
outliers <- c(which.max(dressel$henkellaenge),
which.min(dressel$henkellaenge)
)
shapiro.test(dressel$henkellaenge[-outliers])
##
## Shapiro-Wilk normality test
##
## data: dressel$henkellaenge[-outliers]
## W = 0.96026, p-value = 0.6067
qqnorm(dressel$henkellaenge[-outliers])
qqline(dressel$henkellaenge[-outliers])
Now that we have decided that our sample is reasonably normally distributed, we can proceed to ensure that the other prerequisites are fulfilled. To decide which kind of t-test we can apply here, we have to test whether or not we have homogeneity of variance between the two samples. This can be done using the F test. There are other alternatives for testing for homogeneity of variances, for example Levene's test or the Brown–Forsythe test. These tests perform better if the normality assumption of the F test is violated. The benefit of the F test is that it is rather simple and easy to understand, so for our introduction this test is very suitable.
F test for variance homogeneity of two samples
requirements: two independent normally distributed samples of a metric scaled variable
\(H_0\) Both samples have the same variance (dispersion)
\(H_1\) Both samples have a different variance (dispersion)
Basic idea: if both variances are equal, their quotient should be 1
\[ s_1^2 = s_2^2;\ then\ \frac{s_1^2}{s_2^2} = 1 \]
The quotient is compared to a tabled threshold (e.g. in Shennan) according to the degrees of freedom ( \(df_1 = n_1 - 1; df_2 = n_2 - 1\) ) and the desired significance level.
If the calculated quotient > threshold, then \(H_0\) is rejected, otherwise not.
significant: unequal variances
not significant: we can assume homogeneous variances (for further tests)
site 1; site 2
\(n_1 = 20; \bar{x}_1 = 20.015\)
\(variance\ s_1^2 = \frac{\sum^{n}_{i=1}{(x_i - \bar{x})^2}}{n-1}=\frac{\sum^{20}_{i=1}{(x_i - 20.015)^2}}{20-1}=40.20871\)
\(n_2 = 25; \bar{x}_2 = 20.492\)
\(variance\ s_2^2 = \frac{\sum^{25}_{i=1}{(x_i - 20.492)^2}}{25-1}=33.0641\)
\(F = \frac{s_1^2}{s_2^2}=\frac{40.20871}{33.0641}=1.216\)
\(df_1\) = 20−1 = 19, \(df_2\) = 25 − 1 = 24; Sign.level=0.05
threshold at \(df_1\) = 19, \(df_2\) = 24, \(\alpha\)=0.05: 2.114
1.216 < 2.114; not significant, the variances do not differ significantly from each other
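Instead of looking the threshold up in a table, we can let R compute the critical value and the p-value directly (a sketch using the summary values from the example above; tabled thresholds may differ slightly depending on the convention used):

```r
s1_sq <- 40.20871  # variance at site 1, n1 = 20
s2_sq <- 33.0641   # variance at site 2, n2 = 25

F_value <- s1_sq / s2_sq  # 1.216
df1 <- 20 - 1
df2 <- 25 - 1

# critical value for a two-sided test at the 0.05 level
qf(0.975, df1, df2)

# two-sided p-value, as reported by var.test()
2 * (1 - pf(F_value, df1, df2))  # approx. 0.643
```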
### F-Test in R
var.test(length ~ site, data = blade_length)
##
## F test to compare two variances
##
## data: length by site
## F = 1.2161, num df = 19, denom df = 24, p-value = 0.643
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.5185518 2.9822271
## sample estimates:
## ratio of variances
## 1.216084
The result is not significant; the variances are not significantly different.
## Exercise F-test
(Logarithmic) sizes of ritual enclosures at the Society Islands (example by Shennan)
Given are the (logarithmic) sizes of ritual enclosures in two valleys at the Society Islands.
Please check whether the variances in both valleys are different!
file: marae.csv
Test for the comparison of the means of two samples.
If the means differ significantly, it is assumed that both samples come from different populations (in the statistical sense).
requirements: two independent normally distributed samples of a metric scaled variable with homogeneous variances
\(H_0\): The populations of both samples have the same mean
\(H_1\): The populations of both samples have a different mean
Basic idea: if the means of both samples lie within the standard error of the estimates of the means of the corresponding populations, then both populations could potentially be the same. Otherwise not.
## t-Test, homogeneous variances
### Calculation ‘by hand’
Example: blade lengths, site 1 and site 2
\(n_1 = 20; \bar{x}_1 = 20.015; s^2_1 = 40.20871\)
\(n_2 = 25; \bar{x}_2 = 20.492; s^2_2 = 33.0641\)
\(t = \frac{\text{difference between group means}}{\text{variability of groups}}\)
\(t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{(\bar{x}_1-\bar{x}_2)}}\)
\(SE = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}\)
\(SE_{(\bar{x}_1-\bar{x}_2)} = \sqrt{\frac{S^2}{n_1} + \frac{S^2}{n_2}}\)
\(S^2 = \frac{(n_1-1) \cdot s_1^2 + (n_2-1) \cdot s_2^2}{n_1 + n_2 - 2}\)
\(S^2 = \frac{(20-1) \cdot 40.20871 + (25-1) \cdot 33.0641}{20+25-2} = 36.22102\)
\(SE_{(\bar{x}_1-\bar{x}_2)} = \sqrt{\frac{36.22102}{20} + \frac{36.22102}{25}} = 1.805517\)
\(t = \frac{20.015 - 20.492}{1.805517} = -0.26419\)
\(df = (n_1 − 1) + (n_2 − 1) = n_1 + n_2 − 2 = 20 + 25 − 2= 43\)
Looking it up in a table (e.g. in Shennan): df = 43; sig. level = 0.05
Differences can go in both directions (bigger or smaller), therefore a two-tailed question
threshold: 2.021
not significant, we cannot reject the null hypothesis
There is no significant difference in the means of both samples; they could originate from the same population (statistically speaking).
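The calculation ‘by hand’ can also be reproduced step by step in R (a sketch using the summary values from above):

```r
n1 <- 20; m1 <- 20.015; s1_sq <- 40.20871  # site 1
n2 <- 25; m2 <- 20.492; s2_sq <- 33.0641   # site 2

# pooled variance
S_sq <- ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# standard error of the difference of the means
SE <- sqrt(S_sq / n1 + S_sq / n2)

t_value <- (m1 - m2) / SE  # -0.26419, as in the calculation above
df <- n1 + n2 - 2          # 43

# exact threshold instead of the table value
qt(0.975, df)

abs(t_value) < qt(0.975, df)  # TRUE: not significant
```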
Now let's run the same t-test directly in R:
t.test(length ~ site, data=blade_length, var.equal=T)
##
## Two Sample t-test
##
## data: length by site
## t = -0.26419, df = 43, p-value = 0.7929
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.118172 3.164172
## sample estimates:
## mean in group site1 mean in group site2
## 20.015 20.492
not significant, we cannot reject the null hypothesis
## Exercise t-test
Again the (logarithmic) sizes of ritual enclosures at the Society Islands (example by Shennan)
Given are the (logarithmic) sizes of ritual enclosures in two valleys at the Society Islands.
Please check whether both valleys differ with respect to the mean sizes of the enclosures!
file: marae.csv
If the variances are not homogeneous, the Welch test can be used: it approximates the result by a correction of the degrees of freedom. Let's first check the variances in a new dataset (blade_length3.csv):

blade_length3 <- read.csv("blade_length3.csv")
var.test(length ~ site, data = blade_length3)
##
## F test to compare two variances
##
## data: length by site
## F = 2.8288, num df = 19, denom df = 19, p-value = 0.02854
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.119663 7.146758
## sample estimates:
## ratio of variances
## 2.828774
The variances differ significantly, so we use the Welch version, which is the default of t.test():

t.test(length ~ site, data = blade_length3)
##
## Welch Two Sample t-test
##
## data: length by site
## t = -1.8368, df = 30.941, p-value = 0.07586
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.394619 0.334619
## sample estimates:
## mean in group site1 mean in group site3
## 20.015 23.045
Tests with fewer preconditions on the data often have less power.
A <- rnorm(20)
B <- rnorm(20) + 0.5
t.test(A, B, var.equal = TRUE)
##
## Two Sample t-test
##
## data: A and B
## t = -2.0184, df = 38, p-value = 0.05065
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.217006680 0.001806626
## sample estimates:
## mean of x mean of y
## -0.163934 0.443666
t.test(A, B)
##
## Welch Two Sample t-test
##
## data: A and B
## t = -2.0184, df = 34.087, p-value = 0.05147
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.219312359 0.004112305
## sample estimates:
## mean of x mean of y
## -0.163934 0.443666
In most cases (with small sample sizes), the Welch test will be slightly more hesitant to reject the null hypothesis.
The following is important for all statistical tests!
What, if one compares more than two groups?
Given are the hypothetical sizes of TRB sites in Schleswig-Holstein by region and wetness.
settlements <- read.csv("settlements.csv")
settlements$wetness <- factor(settlements$wetness)
head(settlements)
## size regions wetness
## 1 41.161 Steinburg humid
## 2 26.497 Kiel humid
## 3 39.894 Dithmarschen humid
## 4 17.783 Neumünster humid
## 5 36.158 Rendsburg-Eckernförde humid
## 6 27.181 Stormarn humid
Question: Do the sizes of the settlements differ significantly in relation to the wetness?
How to proceed?
Intuitive, but problematic answer: we test all groups against each other to see if there is a significant difference.
Problem: The more often we test, the more likely is a significant result ‘by chance’.
Example: We test 3 groups, therefore we need 3 tests:
medium ↔︎ humid, humid ↔︎ arid, arid ↔︎ medium
The probability of obtaining a significant result although the null hypothesis is true is 0.05 for each test. With three tests, the probability of at least one such false positive rises to almost 0.15 (more precisely \(1 - 0.95^3 \approx 0.14\)).
With 100 tests, the expected value becomes 5.0 (meaning we expect 5 tests to show a false positive significance)!
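A small simulation illustrates this (a sketch: both samples are always drawn from the very same population, so every significant result is a false positive; the exact count depends on the random seed):

```r
set.seed(123)

n_tests <- 100
false_positives <- 0

for (i in 1:n_tests) {
  a <- rnorm(20)  # both samples come from
  b <- rnorm(20)  # exactly the same population
  if (t.test(a, b)$p.value < 0.05)
    false_positives <- false_positives + 1
}

false_positives  # roughly 5 out of 100 tests are 'significant' by chance
```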
The whole sequence of tests is regarded as one test. Therefore, the total significance level should be 0.05. To achieve this, we divide the 0.05 by the number of tests to get the significance level for each individual test (the Bonferroni correction). Equivalently, as p.adjust() does, each p-value can be multiplied by the number of tests and then compared to 0.05.
Example: We test 3 groups, therefore we need 3 tests:
medium ↔︎ humid: p-value = \(0.2869584\) humid ↔︎ arid: p-value = \(1.9394078\times 10^{-7}\) arid ↔︎ medium: p-value = \(3.4368554\times 10^{-5}\)
p.adjust(c(0.2869584, 1.939408e-07, 3.436855e-05), method = "bonferroni")
## [1] 8.608752e-01 5.818224e-07 1.031056e-04
The first result is not significant (admittedly, it was not before either).
dependent variable: the variable to test (here size)
independent variable: the grouping variable (here wetness)
boxplot(size ~ wetness, data = settlements)
summary(aov(size ~ wetness, data = settlements))
## Df Sum Sq Mean Sq F value Pr(>F)
## wetness 2 3564 1782.1 23.12 1.7e-07 ***
## Residuals 42 3238 77.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There are significant differences, but where?
A more detailed result:
We analyse the result as a linear model.
summary.lm(aov(size~wetness, data=settlements))
##
## Call:
## aov(formula = size ~ wetness, data = settlements)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.097 -7.252 1.236 4.627 21.244
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.388 2.267 4.141 0.000163 ***
## wetnesshumid 20.386 3.206 6.359 1.21e-07 ***
## wetnessmedium 16.880 3.206 5.265 4.48e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.78 on 42 degrees of freedom
## Multiple R-squared: 0.524, Adjusted R-squared: 0.5013
## F-statistic: 23.12 on 2 and 42 DF, p-value: 1.698e-07
The first group (arid) is the control group; the others are compared to it. If we are interested in the differences between medium and the other soils, we have to change the control group to ‘medium’.
changing the control group:
wetness.new <- relevel(settlements$wetness, ref = "medium")
summary.lm(aov(settlements$size ~ wetness.new))
##
## Call:
## aov(formula = settlements$size ~ wetness.new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.097 -7.252 1.236 4.627 21.244
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.268 2.267 11.587 1.16e-14 ***
## wetness.newarid -16.880 3.206 -5.265 4.48e-06 ***
## wetness.newhumid 3.506 3.206 1.094 0.28
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.78 on 42 degrees of freedom
## Multiple R-squared: 0.524, Adjusted R-squared: 0.5013
## F-statistic: 23.12 on 2 and 42 DF, p-value: 1.698e-07
There is a significant difference between the control (medium) and arid soils.
pairwise.t.test(settlements$size,settlements$wetness,
p.adjust="bonferroni")
##
## Pairwise comparisons using t tests with pooled SD
##
## data: settlements$size and settlements$wetness
##
## arid humid
## humid 3.6e-07 -
## medium 1.3e-05 0.84
##
## P value adjustment method: bonferroni
If we are interested in the influence of two grouping variables:

boxplot(size ~ regions, data = settlements)
interaction.plot(settlements$wetness, settlements$regions,
                 settlements$size)
summary(aov(size~wetness*regions, data=settlements))
## Df Sum Sq Mean Sq F value Pr(>F)
## wetness 2 3564 1782.1 29.331 3.08e-06 ***
## regions 13 1372 105.6 1.738 0.142
## wetness:regions 12 832 69.4 1.142 0.391
## Residuals 17 1033 60.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Notation in the formula:
*: the interaction of the grouping variables is considered
+: the grouping variables are considered as independent
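The two notations can be compared directly (a sketch, assuming the settlements data from above is loaded):

```r
# '*' includes the interaction: size ~ wetness + regions + wetness:regions
summary(aov(size ~ wetness * regions, data = settlements))

# '+' treats the grouping variables as independent: no interaction term
summary(aov(size ~ wetness + regions, data = settlements))
```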
All the flavours of ANOVA:
one independent factor (grouping variable), one dependent variable (measurement)
two or multiple independent factors (grouping variables), one dependent variable (measurement)
two or multiple independent factors (grouping variables), multiple dependent variables (measurements)